On the Tails of Web File Size Distributions
Abstract
Power laws have been observed in various contexts in the Internet. There has been considerable interest in identifying the mechanisms behind these power laws. Most of this work has focused on the tail behavior of the distributions. We argue that the tails and their asymptotic behavior are very hard to substantiate in realistic engineering systems. In this paper we describe some of the proposed mechanisms for producing power law tails. We show that these mechanisms are not particularly robust. Furthermore, we argue that the data usually available for classifying a distribution is insufficient to classify the tail. Fortunately, the tail has little impact on Internet performance. Thus it is sufficient to focus on mechanisms leading to power-law-like “waists” of the distributions.
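The claim that finite data may not be enough to classify the tail can be illustrated with a small numerical sketch. The following Python snippet is only an illustration under stated assumptions, not the paper's method: the Pareto shape (1.2), the lognormal parameters (mu 0, sigma 2.5), the sample size, and the "waist" range are all hypothetical choices. It fits a straight line to the empirical log-log CCDF over a mid-range window for both a true power law and a lognormal, which has no power law tail.

```python
import math
import random

def ccdf_points(samples):
    """Return (log10 x, log10 P[X > x]) pairs from the empirical CCDF."""
    xs = sorted(samples)
    n = len(xs)
    # Skip the largest sample: its empirical CCDF is 0, so the log is undefined.
    return [(math.log10(x), math.log10((n - i - 1) / n))
            for i, x in enumerate(xs[:-1])]

def loglog_slope(points, lo, hi):
    """Least-squares slope of the log-log CCDF for lo <= log10 x <= hi."""
    pts = [(u, v) for u, v in points if lo <= u <= hi]
    mean_u = sum(u for u, _ in pts) / len(pts)
    mean_v = sum(v for _, v in pts) / len(pts)
    num = sum((u - mean_u) * (v - mean_v) for u, v in pts)
    den = sum((u - mean_u) ** 2 for u, _ in pts)
    return num / den

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 20000                # hypothetical sample size

# Assumed illustrative parameters; not taken from the paper.
pareto = [rng.paretovariate(1.2) for _ in range(n)]
lognormal = [rng.lognormvariate(0.0, 2.5) for _ in range(n)]

# "Waist" window: sizes between 10**0.5 and 10**2 (an arbitrary mid-range).
pareto_slope = loglog_slope(ccdf_points(pareto), 0.5, 2.0)
lognormal_slope = loglog_slope(ccdf_points(lognormal), 0.5, 2.0)

print(f"Pareto waist slope:    {pareto_slope:.2f}")
print(f"lognormal waist slope: {lognormal_slope:.2f}")
```

Both samples yield a finite negative slope over the window, so a straight-line fit to the waist alone does not distinguish a power law from a lognormal. Telling them apart would require observations far out in the tail, where samples are scarce.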
Similar resources
Heavy-tailed Probability Distributions in the World Wide Web
The explosion of the World Wide Web as a medium for information dissemination has made it important to understand its characteristics, in particular the distribution of its file sizes. This paper presents evidence that a number of file size distributions in the Web exhibit heavy tails, including files requested by users, files transmitted through the network, transmission durations of files, and files stor...
Analysis of the Web Graph Aggregated by Host and Pay-Level Domain
In this paper the web is analyzed as a graph aggregated by host and pay-level domain (PLD). The publicly available web graph datasets were released by the Common Crawl Foundation and are based on a web crawl performed during May, June, and July 2017. The host graph has ∼1.3 billion nodes and ∼5.3 billion arcs. The PLD graph has ∼91 million nodes and ∼1.1 billion arcs. We study the...
Heavy-Tailed Distributions, Generalized Source Coding and Optimal Web
The design of robust and reliable networks and network services has become an increasingly challenging task in today's Internet world. To achieve this goal, understanding the characteristics of Internet traffic plays a more and more critical role. Empirical studies of measured traffic traces have led to the wide recognition of self-similarity in network traffic. Moreover, a direct link has been estab...
Modeling Heavy-tails in Traffic Sources for Network Performance Evaluation
Heavy tails in work loads (file sizes, flow lengths, service times, etc.) have significant negative impact on the performance of queues and networks. In the context of the famous Internet file size data of Crovella and some very recent data sets from a wireless mobility network, we examine the new class of LogPH distributions introduced by Ramaswami for modeling heavy tailed random variables. T...